Journal of the Korean Society for Internet Information (한국인터넷정보학회 논문지)

Korean Title: 문서 클러스터를 위한 워드넷기반의 대표 레이블 선정 방법 (A WordNet-Based Method for Selecting Representative Labels for Document Clusters)
English Title: Representative Labels Selection Technique for Document Cluster using WordNet
Author: 김태훈 (Tae-Hoon Kim), 손미애 (Mye Sohn)
Citation: Vol. 18, No. 2, pp. 61-73 (April 2017)
Korean Abstract (English translation)
This study proposes a document cluster labeling (Documents Cluster Labeling) method that uses the information content of words to grasp the meaning implied by each cluster produced by document clustering. To determine how much importance each word carries within its cluster, we compute a weight for each word from its frequency of occurrence and its information content, and then use WordNet to identify the nearest (lowest) common hypernyms of the words in the cluster as candidate labels. Using the information content of the candidate labels identified in this way and their importance weights within the cluster, we determine the representative label that comprehensively expresses the meaning and characteristics of the cluster. To demonstrate the merit of the method, we conducted the following experiment. The labels selected by the proposed method and the candidate labels were projected onto WordNet, and the position (depth) of these labels in WordNet was checked. In addition, we derived the number of words in the cluster that have each selected candidate label as a hypernym, and compared the labels selected by the heuristic method with the representative labels found by experts. The suitability of the candidate labels (suitability) and the appropriacy of the representative labels (appropriacy) were used as evaluation measures. The experiments showed that, when document cluster labeling is performed with the proposed method, the suitability of the candidate labels decreases slightly compared with existing methods, but the amount of computation falls to about 20% of that of existing methods, and the appropriacy of the representative labels is better than with existing methods.
English Abstract
In this paper, we propose a document cluster labeling method that uses the information content of the words in each cluster to capture what the cluster means. First, we compute a weight for each word from its frequency and its information content; this weight indicates how important the word is within the cluster. Next, we identify candidate labels using WordNet: each candidate label is the lowest common hypernym of words in the cluster. Finally, the representative label is determined from the information content of the candidate labels and their importance weights within the cluster. To evaluate the method, we performed a heuristic experiment using two measures, the suitability of the candidate labels (suitability) and the appropriacy of the representative label (appropriacy). With the proposed method, suitability decreases slightly compared with existing methods, but the computational cost is about 20% of that of conventional methods, and appropriacy is better than with existing methods. As a result, the method is expected to help data analysts interpret document clusters more easily.
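The three steps described in the abstract (word weighting, candidate labels from common hypernyms, selection of a representative label) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the authors' formulation: the toy `HYPERNYM` table stands in for WordNet, information content is estimated from cluster frequencies as the negative log of a concept's share of occurrences, a word's weight is its relative frequency times its information content, and a candidate's score is its information content times the summed weight of the words it subsumes. The abstract does not give the paper's actual formulas.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy taxonomy (child -> parent), standing in for WordNet hypernym links.
HYPERNYM = {
    "dog": "canine", "wolf": "canine", "canine": "carnivore",
    "cat": "feline", "feline": "carnivore",
    "carnivore": "animal", "animal": "entity",
    "car": "vehicle", "vehicle": "entity",
}

def path_to_root(word):
    """Return [word, parent, grandparent, ..., root] in the toy taxonomy."""
    path = [word]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path

def lowest_common_hypernym(a, b):
    """First node on a's upward path that is also an ancestor of b (or b itself)."""
    ancestors_b = set(path_to_root(b))
    for node in path_to_root(a):  # walks upward, so the first hit is the lowest
        if node in ancestors_b:
            return node
    return None

def information_content(concept, freq):
    """Assumed IC: -log of the share of cluster occurrences subsumed by the concept."""
    total = sum(freq.values())
    covered = sum(n for w, n in freq.items() if concept in path_to_root(w))
    return -math.log(covered / total)

def word_weights(freq):
    """Assumed weight: relative frequency times information content of the word."""
    total = sum(freq.values())
    return {w: (n / total) * information_content(w, freq) for w, n in freq.items()}

def candidate_labels(freq):
    """Candidate labels: lowest common hypernyms of every pair of distinct words."""
    cands = {lowest_common_hypernym(a, b) for a, b in combinations(sorted(freq), 2)}
    cands.discard(None)
    return cands

def representative_label(words):
    """Pick the candidate maximizing IC(label) * total weight of the words it covers."""
    freq = Counter(words)
    weights = word_weights(freq)

    def score(label):
        covered = sum(wt for w, wt in weights.items() if label in path_to_root(w))
        return information_content(label, freq) * covered

    # Returns None when the cluster yields no candidate (e.g. a single distinct word).
    return max(sorted(candidate_labels(freq)), key=score, default=None)

cluster = ["dog", "dog", "wolf", "cat", "car"]
print(representative_label(cluster))
```

On this toy cluster the candidates are `canine`, `carnivore`, and `entity`. The root `entity` covers every word but has zero information content under the assumed definition, so the score favors a more specific label, and the sketch selects `canine`. With real data, a WordNet interface would replace the toy table.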
Keywords: Documents Cluster Labeling, Information content, WordNet, Similarity Calculation